In this notebook we present a short analysis of price and carbon intensity for European countries. We first pull data from the Electricity Map, we then display a simple scatterplot.
In [12]:
import arrow
from altair import *
import pandas as pd
import numpy as np
import requests
from utils import *
endpoint = 'http://www.electricitymap.org'
r = requests.session()
We will use the functions defined in the utils python file to fetch data from the Electricity Map. See a README here.
We will use the functions get_state
and get_price
to return both carbon intensity and electricity price per country and per hour.
In order to return all the countries available in the API, we use the function fetch_state
:
In [88]:
# We return here a list of all countries in the Electricity Map
state = fetch_state(arrow.get('2017-02-12'), 1440)
countries = state['countries'].keys()
We then pass this list of countries to our functions. Note that fetching 2 months of data at an hourly level for all countries will take a while. If you want to use exported csv, you can download them from here
In [189]:
# This will take a while to run.
#prices = get_price(countries, '2017-01-22', '2017-02-20', 60)
#co2 = get_state(countries, '2017-01-22', '2017-02-20', 60)
#You can also import csv directly
prices = pd.read_csv('prices.csv', index_col=0)
co2 = pd.read_csv('co2.csv', index_col=0)
A quick look at the number of rows per country, suggest that we should do some data cleaning. In particular we have too little data points for GB and RO:
In [173]:
prices.groupby('country').count().sort_values('price').head()
Out[173]:
In [191]:
prices = prices[-prices['country'].isin(['RO', 'GB'])]
We also make sure to remove duplicates:
In [202]:
co2 = co2.drop_duplicates(['country', 'timestamp'], keep='first')
prices = prices.drop_duplicates(['country', 'timestamp'], keep='first')
We also find that our price service is missing 20 countries out of 38 countries available in the Electricity Map. We display a list of countries below:
In [261]:
pd.Series(countries)[-pd.Series(countries).isin(prices['country'].unique())]
Out[261]:
We can now merge our two dataframes:
In [193]:
carbon_price = pd.merge(prices, co2, how='left', on=['country', 'timestamp'])
carbon_price.head()
Out[193]:
We can start by analyzing the price of electricity per country. We start by plotting timeseries for a subset of countries.
We notice the peak of price at the end of January in France or in Belgium:
In [199]:
Chart(prices[prices['country'].isin(['GR', 'FR', 'BE', 'SI'])],
description="Price of electricity (Euro per MWh) for four EU countries",
).mark_line(
).encode(
column='country:N',
x=X('timestamp:T'),
y=Y('price:Q'),
)
We also notice the sharp variation of price for a given day. We know that electricity demand varies during the day. Let's then analyze price per hour during the day:
In [203]:
# We create a new column with the hour of the day
def hour(t):
return arrow.get(t).hour
prices['hour'] = prices['timestamp'].apply(hour)
We can have a better view of the variation of the price of the electricity during the day in the erro bar graph below:
In [208]:
LayeredChart(prices,
description='A error bar plot showing mean, min, and max of price for FR electricity production',
layers=[Chart().mark_rule().encode(
x='hour:O',
y=Y('min(price):Q',
axis=Axis(
title='Price Ahead',
),
),
y2='max(price):Q',
), Chart().mark_tick().encode(
size=Size(
value=5.0,
),
x='hour:O',
y=Y('min(price):Q',
axis=Axis(
title='Price Ahead',
),
),
), Chart().mark_tick().encode(
size=Size(
value=5.0,
),
x='hour:O',
y=Y('max(price):Q',
axis=Axis(
title='Price Ahead',
),
),
), Chart().mark_point().encode(
size=Size(
value=2.0,
),
x='hour:O',
y=Y('mean(price):Q',
axis=Axis(
title='Price Ahead',
),
),
)],
).transform_data(
filter=(expr.df.country == 'FR')
)
It is also interesting to conduct the same analysis at the carbon intensity level. The strong variation of carbon intensity for a given day suggests for instance that a product like co2signal:
In [211]:
co2['hour'] = co2['timestamp'].apply(hour)
In [223]:
co2_hours = co2['co2intensity'].groupby([co2['country'], co2['hour']]).agg([np.min, np.mean, np.std, np.max]).reset_index()
We can then sort countries per the standard deviation of the carbon intensity for a given hour:
In [227]:
co2_hours.sort_values('std', ascending=False).head()
Out[227]:
It looks like the Montenegro
shows a hight standard deviation, let's print a similar error bar for that country:
In [228]:
LayeredChart(co2,
description='A error bar plot showing mean, min, and max of carbon intensity for ME electricity',
layers=[Chart().mark_rule().encode(
x='hour:O',
y=Y('min(co2intensity):Q',
axis=Axis(
title='Carbon Intensity (gCo2 per KWh)',
),
),
y2='max(co2intensity):Q',
), Chart().mark_tick().encode(
size=Size(
value=5.0,
),
x='hour:O',
y=Y('min(co2intensity):Q',
axis=Axis(
title='Carbon Intensity (gCo2 per KWh)',
),
),
), Chart().mark_tick().encode(
size=Size(
value=5.0,
),
x='hour:O',
y=Y('max(co2intensity):Q',
axis=Axis(
title='Carbon Intensity (gCo2 per KWh)',
),
),
), Chart().mark_point().encode(
size=Size(
value=2.0,
),
x='hour:O',
y=Y('mean(co2intensity):Q',
axis=Axis(
title='Carbon Intensity (gCo2 per KWh)',
),
),
)],
).transform_data(
filter=(expr.df.country == 'ME')
)
We should double check data sources here and especially understand if a bug in the data could explain the low carbon emissions of Montenegro. We print below the dates / hours for which the carbon intensity is the lowest:
In [236]:
co2[co2['country'] == 'ME'].sort_values('co2intensity').head()
Out[236]:
Nonetheless, it is interesting to compare the variation of carbon intensity per hour with Norway, which has less variance for a given hour:
In [238]:
LayeredChart(co2,
description='A error bar plot showing mean, min, and max of carbon intensity for NO electricity',
layers=[Chart().mark_rule().encode(
x='hour:O',
y=Y('min(co2intensity):Q',
axis=Axis(
title='Carbon Intensity (gCo2 per KWh)',
),
),
y2='max(co2intensity):Q',
), Chart().mark_tick().encode(
size=Size(
value=5.0,
),
x='hour:O',
y=Y('min(co2intensity):Q',
axis=Axis(
title='Carbon Intensity (gCo2 per KWh)',
),
),
), Chart().mark_tick().encode(
size=Size(
value=5.0,
),
x='hour:O',
y=Y('max(co2intensity):Q',
axis=Axis(
title='Carbon Intensity (gCo2 per KWh)',
),
),
), Chart().mark_point().encode(
size=Size(
value=2.0,
),
x='hour:O',
y=Y('mean(co2intensity):Q',
axis=Axis(
title='Carbon Intensity (gCo2 per KWh)',
),
),
)],
).transform_data(
filter=(expr.df.country == 'NO')
)
In this section we try to combine the analysis detailed above. In order to do this, we can build an index that captures the variation of the two metrics.
We first start by plotting the average price and the average carbon intensity per country in a scatterplot:
In [239]:
Chart(carbon_price,
description = 'Scatterplot of price and carbon intensity for EU countries').mark_text().encode(
x=X('mean(price):Q',
axis=Axis(
title='Price Ahead (Euro per MWh)',
),
),
y=Y('mean(co2intensity):Q',
axis=Axis(
title='Carbon Intensity (gCo2 per KWh)',
),
),
text='country'
)
It is also interesting to show the variation, both in terms of price and carbon intensity in few scatterplots. We present a subset of the countries in the scatterplots below:
In [242]:
Chart(carbon_price[carbon_price['country'].isin(['ES', 'FR', 'BE', 'SI', 'FI'])],
description="Price per country",
).mark_circle(
opacity = 0.3,
).encode(
column='country:N',
x=X('price:Q'),
y=Y('co2intensity:Q'),
color='country:N'
)
Those charts raise different questions:
We can then blend both carbon intensity and price in one metric. We build this metric by simply dividing the carbon intensity by the price. This gives the Amount (in Kg) of Co2 Emitted for 1 Euro Invested in Electricity
:
In [267]:
carbon_price['euro_per_kg_carbon'] = carbon_price['co2intensity'] / carbon_price['price']
Below is a ranked bar chart per country:
In [272]:
Chart(carbon_price['euro_per_kg_carbon'].groupby(carbon_price['country']).mean().reset_index(),
description='A bar chart displaying the price (in euro) per carbon (in MWh) emitted by the electricity production',
).mark_bar().encode(
x=X('euro_per_kg_carbon:Q',
axis=Axis(
title='Average Carbon Intensity (in Kg) for 1 Euro Spent in Electricity (Subset of EU countries)',
),
),
y=Y('country:N', sort=SortField(field='euro_per_kg_carbon', order='descending', op='sum')),
)
In [269]:
carbon_price['hour'] = carbon_price['timestamp'].apply(hour)
In [273]:
LayeredChart(carbon_price,
description='A error bar plot showing mean, min, and max of carbon intensity for NO electricity',
layers=[Chart().mark_rule().encode(
x='hour:O',
y=Y('min(euro_per_kg_carbon):Q',
axis=Axis(
title='Carbon Intensity (in Kg) for 1 Euro Spent in Electricity',
),
),
y2='max(euro_per_kg_carbon):Q',
), Chart().mark_tick().encode(
size=Size(
value=5.0,
),
x='hour:O',
y=Y('min(euro_per_kg_carbon):Q',
axis=Axis(
title='Carbon Intensity (in Kg) for 1 Euro Spent in Electricity',
),
),
), Chart().mark_tick().encode(
size=Size(
value=5.0,
),
x='hour:O',
y=Y('max(euro_per_kg_carbon):Q',
axis=Axis(
title='Carbon Intensity (in Kg) for 1 Euro Spent in Electricity',
),
),
), Chart().mark_point().encode(
size=Size(
value=2.0,
),
x='hour:O',
y=Y('mean(euro_per_kg_carbon):Q',
axis=Axis(
title='Carbon Intensity (in Kg) for 1 Euro Spent in Electricity',
),
),
)],
).transform_data(
filter=(expr.df.country == 'PL')
)
As next steps, it would be interesting to: